A Feature Selection Based Model for Software Defect Prediction
Abstract
Software is a complex entity composed of various modules, each with a different likelihood of containing defects. Efficient and timely prediction of defects allows software project managers to allocate people, cost, and time effectively for better quality assurance. Defects lower software quality and can also cause a software project to fail. It is sometimes not possible to identify and fix defects during development, so they must be handled whenever team members notice them. It is therefore important to predict defect-prone software modules before deployment in order to plan a better maintenance strategy. Early knowledge of defect-prone modules also helps in making an efficient process improvement plan within a justified time and cost, which in turn can lead to better software releases and higher customer satisfaction. Accurate measurement and prediction of defects is a crucial issue for any software because it is an indirect measurement based on several metrics. Therefore, instead of considering all the metrics, it is more appropriate to find a suitable subset of metrics that are relevant and significant for predicting defects in software modules. This paper proposes a feature selection based Linear Twin Support Vector Machine (LSTSVM) model to predict defect-prone software modules. F-score, a feature selection technique, is used to determine the set of significant metrics that most strongly affect defect prediction in software modules. The reduced metric set obtained after feature selection can improve the efficiency of the predictive model and is then used to identify defective modules in a given set of inputs. This paper evaluates the performance of the proposed model and compares it against other existing machine learning models.
The experiments were performed on four datasets from the PROMISE software engineering repository. The experimental results indicate the effectiveness of the proposed feature selection based LSTSVM predictive model on the basis of standard performance evaluation parameters.
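The abstract does not state the F-score formula, so the following is only a minimal sketch that assumes the commonly used two-class definition: for each metric, the squared distances of the two class means from the overall mean, divided by the sum of the two within-class sample variances. The function names `f_scores` and `select_top_k`, the toy data, and the choice of `k` are hypothetical and are not taken from the paper. The classifier that would be trained on the selected metrics is not shown.

```python
from statistics import mean, variance


def f_scores(rows, labels):
    """Return one F-score per feature column for binary labels (1 = defective)."""
    pos = [r for r, y in zip(rows, labels) if y == 1]
    neg = [r for r, y in zip(rows, labels) if y == 0]
    scores = []
    for j in range(len(rows[0])):
        all_j = [r[j] for r in rows]
        pos_j = [r[j] for r in pos]
        neg_j = [r[j] for r in neg]
        # Between-class term: distance of each class mean from the overall mean.
        between = (mean(pos_j) - mean(all_j)) ** 2 + (mean(neg_j) - mean(all_j)) ** 2
        # Within-class term: sample variances (n - 1 in the denominator).
        within = variance(pos_j) + variance(neg_j)
        # Simplification: score 0.0 when both classes have zero variance.
        scores.append(between / within if within > 0 else 0.0)
    return scores


def select_top_k(scores, k):
    """Return the indices of the k highest-scoring features, best first."""
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]


# Toy data, made up for illustration: two hypothetical metrics per module.
rows = [
    [10.0, 3.0],
    [12.0, 1.0],
    [11.0, 2.0],
    [30.0, 2.0],
    [32.0, 3.0],
    [31.0, 1.0],
]
labels = [0, 0, 0, 1, 1, 1]

scores = f_scores(rows, labels)
selected = select_top_k(scores, 1)
print(scores, selected)
```

In this toy data the first metric separates the two classes and the second does not, so only the first would be kept. How many metrics to retain is a design choice that the abstract leaves open.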
Related resources
A Novel Feature Subset Selection Algorithm for Software Defect Prediction
Feature subset selection is the process of choosing a subset of good features with respect to the target concept. A clustering based feature subset selection algorithm has been applied over software defect prediction data sets. Software defect prediction domain has been chosen due to the growing importance of maintaining high reliability and high quality for any software being developed. A soft...
Choosing the Best Classification Performance Metric for Wrapper-based Software Metric Selection for Defect Prediction
Software metrics and fault data are collected during the software development cycle. A typical software defect prediction model is trained using this collected data. Therefore the quality and characteristics of the underlying software metrics play an important role in the efficacy of the prediction model. However, superfluous software metrics often exist. Identifying a small subset of metrics b...
Choosing software metrics for defect prediction: an investigation on feature selection techniques
The selection of software metrics for building software quality prediction models is a search-based software engineering problem. An exhaustive search for such metrics is usually not feasible due to limited project resources, especially if the number of available metrics is large. Defect prediction models are necessary in aiding project managers for better utilizing valuable project resources f...
FSCR: A Feature Selection Method for Software Defect Prediction
Predicting the number of faults in software modules can be more helpful than predicting whether modules are faulty or non-faulty. Some regression models have been used for predicting the number of faults. However, software defect data may involve irrelevant and redundant module features, which degrade the performance of these regression models. To address this issue, this paper pro...
Metaheuristic Optimization based Feature Selection for Software Defect Prediction
Software defect prediction has been an important research topic in the software engineering field, especially for addressing the inefficiency and ineffectiveness of existing industrial approaches to software testing and review. The performance of software defect prediction decreases significantly because the data set contains noisy attributes and class imbalance. Feature selection is generally used in ma...